Linear-Time Construction of Suffix Arrays

نویسندگان

  • Dong Kyue Kim
  • Jeong Seop Sim
  • Heejin Park
  • Kunsoo Park
چکیده

The time complexity of suffix tree construction has been shown to be equivalent to that of sorting: O(n) for a constant-size alphabet or an integer alphabet and O(n logn) for a general alphabet. However, previous algorithms for constructing suffix arrays have the time complexity of O(n logn) even for a constant-size alphabet. In this paper we present a linear-time algorithm to construct suffix arrays for integer alphabets, which do not use suffix trees as intermediate data structures during its construction. Since the case of a constant-size alphabet can be subsumed in that of an integer alphabet, our result implies that the time complexity of directly constructing suffix arrays matches that of constructing suffix trees.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Suffix Trees and Suffix Arrays

Iowa State University 1.1 Basic Definitions and Properties . . . . . . . . . . . . . . . . . . . . 1-1 1.2 Linear Time Construction Algorithms . . . . . . . . . . . . . 1-4 Suffix Trees vs. Suffix Arrays • Linear Time Construction of Suffix Trees • Linear Time Construction of Suffix Arrays • Space Issues 1.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ...

متن کامل

Advanced topics in algorithms

Lowest common ancestor algorithms are in [12, 19, 2]. Algorithms to construct suffix trees in linear time are in [22, 18, 21, 5]. Suffix arrays were introduced in [17]. The linear time construction algorithm for suffix arrays is from [14]. The simple construction of the LCP array from the suffix array is from [15]. The k-mismatch problem is discussed in [7, 16, 1]. The FM index is from [6]. Som...

متن کامل

The Virtual Suffix Tree: An Efficient Data Structure for Suffix Trees and Suffix Arrays

We introduce the VST (virtual suffix tree), an efficient data structure for suffix trees and suffix arrays. Starting from the suffix array, we construct the suffix tree, from which we derive the virtual suffix tree. The VST provides the same functionality as the suffix tree, including suffix links, but at a much smaller space requirement. It has the same linear time construction even for large ...

متن کامل

Faster Suffix Tree Construction with Missing

We consider suffix tree construction for situations with missing suffix links. Two examples of such situations are suffix trees for parameterized strings and suffix trees for two-dimensional arrays. These trees also have the property that the node degrees may be large. We add a new backpropagation component to McCreight’s algorithm and also give a high probability hashing scheme for large degre...

متن کامل

Scalable Parallel Suffix Array Construction

Suffix arrays are a simple and powerful data structure for text processing that can be used for full text indexes, data compression, and many other applications in particular in bioinformatics. We describe the first implementation and experimental evaluation of a scalable parallel algorithm for suffix array construction. The implementation works on distributed memory computers using MPI, Experi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003